
293 research outputs found

Distinct genealogies for plasmids and chromosome

Author: Achtman Mark
Zhou Zhemin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/12/2014

An earlier perspective on the diversity of conjugative elements in microbes [1] attempted to provide a broad audience with an introductory overview of the arcane biology of mobile genetic elements and their terminologies. It might well have been entitled "Plasmids, ICEs, IMEs, and Other Mobile Elements for Dummies," but common sense prevailed. This perspective introduces two related articles in the current issue of PLOS Genetics [2,3] and might have equally aptly been entitled "Antibiotic-Resistant Plasmids and Their Epidemiology for Dummies."

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

How old are bacterial pathogens?

Author: Achtman Mark
Publication venue: The Royal Society Publishing
Publication date: 17/08/2016

Only a few molecular studies have addressed the age of bacterial pathogens that infected humans before the beginnings of medical bacteriology, but these have provided dramatic insights. The global genetic diversity of Helicobacter pylori, which infects human stomachs, parallels that of its human host. The time to the Most Recent Common Ancestor (tMRCA) of these bacteria approximates that of anatomically modern humans, i.e. at least 100,000 years, after calibrating the evolutionary divergence within H. pylori against major ancient human migrations. Similarly, genomic reconstructions of Mycobacterium tuberculosis, the cause of tuberculosis, from ancient skeletons in South America and mummies in Hungary support estimates of <6,000 years for the tMRCA of M. tuberculosis. Finally, modern global patterns of genetic diversity and ancient DNA studies indicate that during the last 5,000 years plague caused by Yersinia pestis has spread globally on multiple occasions from China and Central Asia. Such tMRCA estimates provide only lower bounds on the ages of bacterial pathogens, and additional studies are needed for realistic upper bounds on how long humans and animals have suffered from bacterial diseases.

PubMed Central

Warwick Research Archives Portal Repository

Metagenomics of the modern and historical human oral microbiome with phylogenetic studies on Streptococcus mutans and Streptococcus sobrinus

Author: Achtman Mark
Zhou Zhemin
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 05/10/2020

We have recently developed bioinformatic tools to accurately assign metagenomic sequence reads to microbial taxa: SPARSE [1] for probabilistic, taxonomic classification of sequence reads, EToKi [2] for assembling and polishing genomes from short read sequences, and GrapeTree [3], a graphic visualizer of genetic distances between large numbers of genomes. Together, these methods support comparative analyses of genomes from ancient skeletons and modern humans [2,4]. Here we illustrate these capabilities with 784 samples from historical dental calculus, modern saliva and modern dental plaque. The analyses revealed 1591 microbial species within the oral microbiome. We anticipated that the oral complexes of Socransky et al. [5] would predominate among taxa whose frequencies differed by source. However, although some species discriminated between sources, we could not confirm the existence of the complexes. The results also illustrate further functionality of our pipelines with two species that are associated with dental caries, Streptococcus mutans and Streptococcus sobrinus. They were rare in historical dental calculus but common in modern plaque, and even more common in saliva. Reconstructed draft genomes of these two species from metagenomic samples in which they were abundant were combined with modern public genomes to provide a detailed overview of their core genomic diversity.

Warwick Research Archives Portal Repository

Formal comment to Pettengill: the time to most recent common ancestor does not (usually) approximate the date of divergence

Author: Achtman Mark
Didelot Xavier
Zhou Zhemin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 09/06/2015

In 2013 Zhou et al. concluded that Salmonella enterica serovar Agona represents a genetically monomorphic lineage of recent ancestry, whose most recent common ancestor existed in 1932, or earlier. The Abstract stated ‘Agona consists of three lineages with minimal mutational diversity: only 846 single nucleotide polymorphisms (SNPs) have accumulated in the non-repetitive, core genome since Agona evolved in 1932 and subsequently underwent a major population expansion in the 1960s.’ These conclusions have now been criticized by Pettengill, who claims that the evolutionary models used to date Agona may not have been appropriate, the dating estimates were inaccurate, and the age of emergence of Agona should have been qualified by an upper limit reflecting the date of its divergence from an outgroup, serovar Soerenga. We dispute these claims. Firstly, Pettengill’s analysis of Agona is not justifiable on technical grounds. Secondly, an upper limit for divergence from an outgroup would only be meaningful if the outgroup were closely related to Agona, but close relatives of Agona are yet to be identified. Thirdly, it is not possible to reliably date the time of divergence between Agona and Soerenga. We conclude that Pettengill’s criticism is comparable to a tempest in a teapot.

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

Spiral - Imperial College Digital Repository

Accurate reconstruction of bacterial pan- and core genomes with PEPPAN

Author: Achtman Mark
Charlesworth Jane
Zhou Zhemin
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 14/10/2020

Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications and horizontal gene transfer. In order to reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus. PEPPAN outperforms existing pangenome methods by providing consistent gene and pseudogene annotations extended by similarity-based gene predictions, and identifying and excluding paralogs by combining tree- and synteny-based approaches. The PEPPAN package additionally includes PEPPAN_parser, which implements additional downstream analyses including the calculation of trees based on accessory gene content or allelic differences between core genes. In order to test the accuracy of PEPPAN, we implemented SimPan, a novel pipeline for simulating the evolution of bacterial pangenomes. We compared the accuracy and speed of PEPPAN with four state-of-the-art pangenome pipelines using both empirical and simulated datasets. PEPPAN was more accurate and more specific than any of the other pipelines and was almost as fast as any of them. As a case study, we used PEPPAN to construct a pangenome of ~40,000 genes from 3052 representative genomes spanning at least 80 species of Streptococcus. The resulting gene and allelic trees provide an unprecedented overview of the genomic diversity of the entire Streptococcus genus.

Warwick Research Archives Portal Repository

Neutral genomic microevolution of a recently emerged pathogen, Salmonella enterica serovar Agona

Author: Achtman Mark
Brisse Sylvain
Brown Derek
Cormican Martin
Fanning Seamus
Guttman David S.
Litrup Eva
McCann Angela
Murphy Ronan
Zhou Zhemin
Publication venue: Public Library of Science
Publication date: 01/01/2013

Salmonella enterica serovar Agona has caused multiple food-borne outbreaks of gastroenteritis since it was first isolated in 1952. We analyzed the genomes of 73 isolates from global sources, comparing five distinct outbreaks with sporadic infections as well as food contamination and the environment. Agona consists of three lineages with minimal mutational diversity: only 846 single nucleotide polymorphisms (SNPs) have accumulated in the non-repetitive, core genome since Agona evolved in 1932 and subsequently underwent a major population expansion in the 1960s. Homologous recombination with other serovars of S. enterica imported 42 recombinational tracts (360 kb) in 5/143 nodes within the genealogy, which resulted in 3,164 additional SNPs. In contrast to this paucity of genetic diversity, Agona is highly diverse according to pulsed-field gel electrophoresis (PFGE), which is used to assign isolates to outbreaks. PFGE diversity reflects a highly dynamic accessory genome associated with the gain or loss (indels) of 51 bacteriophages, 10 plasmids, and 6 integrative conjugational elements (ICE/IMEs), but did not correlate uniquely with outbreaks. Unlike the core genome, indels occurred repeatedly in independent nodes (homoplasies), resulting in inaccurate PFGE genealogies. The accessory genome contained only a few cargo genes relevant to infection, other than antibiotic resistance. Thus, most of the genetic diversity within this recently emerged pathogen reflects changes in the accessory genome, or is due to recombination, but these changes seemed to reflect neutral processes rather than Darwinian selection. Each outbreak was caused by an independent clade, without universal, outbreak-associated genomic features, and none of the variable genes in the pan-genome seemed to be associated with an ability to cause outbreaks.

Queen's University Belfast Research Portal

Directory of Open Access Journals

Irish Universities

PubMed Central

Warwick Research Archives Portal Repository

Cork Open Research Archive

Spiral - Imperial College Digital Repository

BlastFrost: fast querying of 100,000s of bacterial genomes in Bifrost graphs

Author: Achtman Mark
Holley Guillaume
Luhmann Nina
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/01/2021

BlastFrost is a highly efficient method for querying 100,000s of genome assemblies, building on Bifrost, a dynamic data structure for compacted and colored de Bruijn graphs. BlastFrost queries a Bifrost data structure for sequences of interest, and extracts local subgraphs, enabling the identification of the presence or absence of individual genes or single nucleotide sequence variants. We show two examples using Salmonella genomes, finding within minutes the presence of genes in the SPI-2 pathogenicity island in a collection of 926 genomes; and identifying single nucleotide polymorphisms associated with fluoroquinolone resistance in three genes among 190,209 genomes. BlastFrost is available at https://github.com/nluhmann/BlastFrost

Warwick Research Archives Portal Repository

The EnteroBase user's guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny and Escherichia core genomic diversity

Author: Achtman Mark
Alikhan Nabil-Fareed
Fan Yulei
Mohamed Khaled
Tyne William
Zhou Zhemin
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 06/12/2019

EnteroBase is an integrated software environment which supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella, and genotyped those assemblies by core genome Multilocus Sequence Typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports SNP calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by case study 2 which summarizes the microevolution of Yersinia pestis over the last 5,000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by case study 3 which presents a novel, global overview of the population structure of all of the species, subspecies and clades within Escherichia.

Warwick Research Archives Portal Repository

Mismatch induced speciation in Salmonella: model and data

Author: Achtman Mark
Conrad Donald F
Didelot Xavier
Falush Daniel
Torpdahl Mia
Wilson Daniel J
Publication venue: The Royal Society
Publication date: 01/01/2006

In bacteria, DNA sequence mismatches act as a barrier to recombination between distantly related organisms and can potentially promote the cohesion of species. We have performed computer simulations which show that the homology dependence of recombination can cause de novo speciation in a neutrally evolving population once a critical population size has been exceeded. Our model can explain the patterns of divergence and genetic exchange observed in the genus Salmonella, without invoking either natural selection or geographical population subdivision. If this model were validated, based on extensive sequence data, it would imply that the named subspecies of Salmonella enterica correspond to good biological species, making species boundaries objective. However, multilocus sequence typing data, analysed using several conventional tools, provide a misleading impression of relationships within S. enterica subspecies enterica and do not provide the resolution to establish whether new species are presently being formed.

PubMed Central

Oxford University Research Archive

The role of China in the global spread of the current cholera pandemic

Author: Achtman Mark
Didelot Xavier
Kan Biao
Li Dongfang
McCann Angela
Ni Peixiang
Pang Bo
Zhou Zhemin
Publication venue: Public Library of Science
Publication date: 01/01/2015

Epidemics and pandemics of cholera, a severe diarrheal disease, have occurred since the early 19th century and waves of epidemic disease continue today. Cholera epidemics are caused by individual, genetically monomorphic lineages of Vibrio cholerae: the ongoing seventh pandemic, which has spread globally since 1961, is associated with lineage L2 of biotype El Tor. Previous genomic studies of the epidemiology of the seventh pandemic identified three successive sub-lineages within L2, designated waves 1 to 3, which spread globally from the Bay of Bengal on multiple occasions. However, these studies did not include samples from China, which also experienced multiple epidemics of cholera in recent decades. We sequenced the genomes of 71 strains isolated in China between 1961 and 2010, as well as eight from other sources, and compared them with 181 published genomes. The results indicated that outbreaks in China between 1960 and 1990 were associated with wave 1 whereas later outbreaks were associated with wave 2. However, the previously defined waves overlapped temporally, and are an inadequate representation of the shape of the global genealogy. We therefore suggest replacing them with a series of tightly delineated clades. Between 1960 and 1990 multiple such clades were imported into China, underwent further microevolution there and then spread to other countries. China was thus both a sink and source during the pandemic spread of V. cholerae, and needs to be included in reconstructions of the global patterns of spread of cholera.

Public Library of Science (PLOS)

Directory of Open Access Journals

Irish Universities

PubMed Central

Warwick Research Archives Portal Repository

Cork Open Research Archive

FigShare